Influence of protocol variables on outcomes of the star excursion balance test group (SEBT, mSEBT, YBT-LQ) in healthy individuals: a systematic review

Background The “SEBT group,” which includes the Star Excursion Balance Test (SEBT), its modified version (mSEBT), and the Lower Quarter Y-Balance Test (YBT-LQ), is used to assess the limits of stability. Interestingly, the testing protocol allows users a considerable degree of flexibility, which can affect the obtained results. Therefore, the objective of this systematic review was to analyze the impact of different protocol variants within the “SEBT group” on outcomes. Methods Data were acquired by searching 4 databases (MEDLINE, ScienceDirect, Wiley, Springer Link) focusing on studies published in English in peer-reviewed journals, empirical in nature, conducted on healthy individuals, and examining the effects of various protocol variants on test outcomes. Study quality was assessed with the NHLBI quality assessment tool for pre-post studies with no control group. Results The calculation method based on the maximum repetition yields statistically significantly higher results compared to other calculation methods. Allowing unrestricted arm movements during the test results in statistically significantly higher scores compared to the procedure that restricts arm movements. The impact of a warm-up, wearing footwear during testing, and using a dedicated kit remains ambiguous. To obtain reliable results, 4–6 familiarization trials are necessary, though fewer may suffice for athletes experienced in performing the test. Conclusion This systematic review highlights the significant impact of the calculation method and arm movement restrictions on the outcomes of the “SEBT group.” The effects of wearing footwear during testing, warm-up, and using a dedicated test kit remain unclear. The required number of familiarization repetitions may vary depending on the biological maturity level of the person being tested. Future research should develop a warm-up protocol tailored to the needs of the “SEBT group,” and investigate the impact of heel elevation during testing on outcomes.
Systematic review registration The protocol for this systematic review was prospectively registered in the OSF Registries (https://doi.org/10.17605/OSF.IO/JSKH2).


Introduction
Postural stability is the ability to actively maintain the vertical projection of the body's center of gravity within the support area (Andreeva et al., 2021). One dimension of postural stability is the limits of stability (LoS), which define the ranges of the body's center of gravity shifts in various directions that do not lead to loss of balance (Melzer et al., 2008). A popular method of assessing LoS is the "SEBT group," which includes the Star Excursion Balance Test (SEBT), its modified version (mSEBT), and the Lower Quarter Y-Balance Test (YBT-LQ). A major advantage of these tests is their relatively low cost and user-friendliness, making them accessible not just for large sports and rehabilitation centers but also for smaller physiotherapy practices and sports clubs. The test results are primarily used for assessing the risk of injury (Gribble et al., 2012; Plisky et al., 2021) and evaluating the outcomes of interventions (Chaabene et al., 2021), and are also considered as criteria for returning to sports (Oleksy et al., 2021). All these applications are extremely valuable from a training practice perspective, as they provide coaches and instructors with key insights into an athlete's readiness and physical status.
The tests in the "SEBT group" directly measure the reach distance of the lower limbs (Kinzey and Armstrong, 1998; Plisky et al., 2009). The SEBT measures reach in 8 directions, whereas mSEBT and YBT-LQ are focused on 3 directions, utilizing a specialized test kit for the latter. The reduction in the number of directions in mSEBT and YBT-LQ stems from a desire to increase the test's efficiency and to eliminate redundancies (Plisky et al., 2009). By focusing on 3 key directions (anterior [ANT], posterolateral [PL], and posteromedial [PM]), mSEBT and YBT-LQ offer a quicker and more focused assessment, which still effectively assesses LoS but in a more practical manner, especially suitable for clinical environments. Each of the tests involves the participant standing on one leg and reaching as far as possible with the opposite lower limb in the designated directions. From these tests, several outcomes are obtained: (a) absolute (in cm) and normalized reach (in % lower limb length); (b) absolute and normalized composite score; and (c) interlimb ratio of the outcomes mentioned in points a and b.
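The outcomes listed above can be illustrated with a short Python sketch. The formulas are the ones commonly used for these tests (reach divided by limb length for the normalized reach; the sum of the 3 reaches divided by 3 times limb length for the normalized composite score), but this review does not spell them out, so the formulas, the function names, the left/right direction of the ratio, and the sample numbers should all be read as assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Reaches:
    """Reach distances in cm for one stance leg (3-direction mSEBT/YBT-LQ)."""
    ant: float  # anterior
    pm: float   # posteromedial
    pl: float   # posterolateral


def normalized_reach(reach_cm: float, limb_length_cm: float) -> float:
    """Reach as a percentage of lower limb length (assumed definition)."""
    return reach_cm / limb_length_cm * 100.0


def composite_normalized(r: Reaches, limb_length_cm: float) -> float:
    """Normalized composite score (assumed): sum of reaches / (3 * limb length) * 100."""
    return (r.ant + r.pm + r.pl) / (3.0 * limb_length_cm) * 100.0


def interlimb_ratio(left_value: float, right_value: float) -> float:
    """Hypothetical interlimb ratio: left outcome divided by right outcome."""
    return left_value / right_value


# Made-up example values, not data from any included study.
left = Reaches(ant=60.0, pm=100.0, pl=95.0)
right = Reaches(ant=62.0, pm=101.0, pl=97.0)
limb_length = 90.0

left_cs = composite_normalized(left, limb_length)
right_cs = composite_normalized(right, limb_length)
print(f"ANT (left): {normalized_reach(left.ant, limb_length):.1f} %LLL")
print(f"Composite left/right: {left_cs:.1f} / {right_cs:.1f} %LLL")
print(f"Interlimb ratio (composite): {interlimb_ratio(left_cs, right_cs):.3f}")
```

Because every normalized value depends on limb length, a consistent limb length measurement is as much a part of the protocol as the reach measurement itself.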
A very important characteristic of each test is its validity, which informs whether the test measures what it was designed to measure, and its reliability, which indicates whether the test consistently measures what it is intended to measure. Research indicates that the tests in the "SEBT group" have been quite thoroughly examined from this perspective. Research conducted by Plisky et al. (2021) revealed significant differences in "SEBT group" performance among populations, thereby emphasizing the discriminative validity of these tools. Conversely, the relationship between the results of the "SEBT group" and the risk of future injuries (predictive validity) remains unclear. Many indications suggest that injury risk prediction based on "SEBT group" results is justified only for specific populations (Plisky et al., 2021) after applying standardized cutoff values (Lehr et al., 2013). Glave et al. (2016), comparing SEBT results with the LoS test using the Biodex Balance System, found a negative correlation. This suggests that the testing of postural stability through these methods is highly specific, as participants who performed well on one test were likely to score poorly on the other, indicating the unique and distinct nature of each test's assessment of LoS. Furthermore, a systematic review conducted by Powden et al. (2019) demonstrated excellent inter- and intrarater reliability of YBT-LQ results in healthy adults. This is a crucial aspect, indicating that the test outcomes are repeatable and consistent regardless of the evaluator (inter-rater reliability) or the timing of the assessment (intra-rater reliability), which is a necessary condition for utilizing this tool in clinical decision-making.
Interestingly, the testing procedure of the "SEBT group" allows users a considerable degree of flexibility, which can affect the results. This flexibility pertains to the choice of calculation method, restrictions on arm movements, wearing footwear during testing, warm-up, the number of familiarization repetitions, the use of a dedicated test kit, and restrictions on heel lifting. With this in mind, a review of studies analyzing the impact of these protocol variables can serve as a useful source of information for selecting the optimal combination of variables for a given issue. Additionally, the review will provide data that can be used to estimate adjustments when comparing results obtained using different protocols. Therefore, this systematic review aimed to analyze the impact of different protocol variants within the "SEBT group" on outcomes. To the best of the authors' knowledge, this is the first study to address this issue. Its completion will result in the creation of a valuable source of information that will be useful from both a research and clinical practice perspective.

Protocol and registration
The protocol for this systematic review was prospectively registered in the OSF Registries (https://doi.org/10.17605/OSF.IO/JSKH2). The systematic review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines (Page et al., 2021).

Search strategy and study selection
The systematic review was conducted in February 2024 by searching through 4 databases: MEDLINE (using the PubMed search engine), ScienceDirect, Wiley and Springer Link, specifically targeting studies published after the year 1998, the year of publication of the first work using SEBT (Kinzey and Armstrong, 1998). The search strategy, as detailed in Table 1, encompassed a wide range of terms related to YBT-LQ and SEBT, along with related procedural aspects. Search filters included language (English) and discipline (medicine and public health, life sciences, biomedicine). B.Z. and M.O. independently reviewed the titles and abstracts of the identified studies during the search, then analyzed the full texts of relevant studies, and finally compiled a list of qualified research. Next, both lists were compared and discussed. In case consensus could not be reached, the final decision was made based on the opinion of the 3rd author, A.M. In managing the references, Mendeley Reference Manager was employed, facilitating efficient organization of the literature sources.

Eligibility criteria
For this systematic review, the following eligibility criteria were applied: (i) publication in English (full text) in a peer-reviewed journal; (ii) cross-sectional and experimental study design (review articles, editorials, speeches, comments, abstracts, case studies, and surgical procedures were not considered); (iii) a study population comprising individuals of all ages who are healthy, with no history of major lower limb injuries or surgeries, and no diagnosed issues with postural control. In terms of the intervention (iv), the studies may explore the following aspects of the "SEBT group" protocol: choice of calculation method based on the maximum repetition, conducting the test with restricted arm movements, performing the test in footwear, conducting a warm-up before testing, preceding test repetitions with 6 familiarization repetitions, using a dedicated test kit, and allowing heel lifting during testing. Regarding the comparator (v), it might include: choice of calculation method not based on the maximum repetition, conducting the test without restricted arm movements, performing the test barefoot, not conducting a warm-up before testing, preceding test repetitions with a number of familiarization repetitions other than 6, not using a dedicated test kit, and not allowing heel lifting during testing. Finally, the outcome (vi) focused on both absolute and normalized reaches, as well as a composite score.

Methodological quality assessment
The quality of the included studies was assessed using the NHLBI quality assessment tool for before-after (pre-post) studies without a control group (NHLBI, 2022). This evaluation was conducted independently by B.Z. and M.O. It involved selecting one of 5 options for each of 12 items: "yes," "no," "cannot determine/unclear," "not reported," or "not applicable." The total score was calculated as the sum of "yes" responses divided by the number of eligible items, expressed as a percentage. Items marked as "not applicable" were not taken into account when calculating the total score. Subsequently, the overall rating was categorized into one of 3 groups based on the total score: poor (<25%), fair (25%-75%), or good (>75%). After the independent assessments, the results were compared and discussed. In cases where consensus was not reached, the opinion of a 3rd author, A.M., was sought.
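The scoring rule described above can be expressed as a minimal Python sketch. The function names and the example responses are hypothetical, and the treatment of the boundary values (exactly 25% and exactly 75% counted as "fair") follows the ranges stated in the text.

```python
from typing import Iterable


def quality_score(responses: Iterable[str]) -> float:
    """Percentage of 'yes' answers among items not marked 'not applicable'."""
    answers = [r.strip().lower() for r in responses]
    eligible = [r for r in answers if r != "not applicable"]
    if not eligible:
        raise ValueError("no eligible items to score")
    yes_count = sum(1 for r in eligible if r == "yes")
    return 100.0 * yes_count / len(eligible)


def quality_rating(score_percent: float) -> str:
    """Map a total score to poor (<25%), fair (25%-75%), or good (>75%)."""
    if score_percent < 25.0:
        return "poor"
    if score_percent <= 75.0:
        return "fair"
    return "good"


# Hypothetical study: 8 "yes", 2 "no", 2 "not applicable" across the 12 items.
example = ["yes"] * 8 + ["no"] * 2 + ["not applicable"] * 2
score = quality_score(example)
print(f"{score:.0f}% -> {quality_rating(score)}")
```

In this example the 2 "not applicable" items are dropped from the denominator, so the score is 8 out of 10 eligible items rather than 8 out of 12.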

Data extraction, grouping and analysis
Using a standardized form, researchers B.Z. and M.O. independently extracted specific data from each study, focusing on descriptors such as sample size, age, gender, and health conditions. Additionally, outcomes obtained using different protocols or the differences between them were collected, employing measures of central tendency (mean) and dispersion (standard deviation or range). The probability of type I error and/or the effect size (partial eta squared, ηp²; Cohen's d) were also determined. Moreover, reliability measures, including the intraclass correlation coefficient (ICC), standard error of measurement (SEM), minimal detectable change (MDC), or smallest detectable difference (SDD) were extracted. The data were systematically compared and discussed. In cases where consensus was not reached, a 3rd researcher, A.M., made the final decision. The extracted data were then categorized based on the variables of the "SEBT group" protocols to understand the impact of each variable on the test outcomes. Subsequently, the data were analyzed in a narrative format.

Notes to Table 2.
Item 1. Was the study question or objective clearly stated?
Item 2. Were eligibility/selection criteria for the study population prespecified and clearly described?
Item 3. Were the participants in the study representative of those who would be eligible for the test/service/intervention in the general or clinical population of interest?
Item 4. Were all eligible participants that met the prespecified entry criteria enrolled?
Item 5. Was the sample size sufficiently large to provide confidence in the findings?
Item 6. Was the test/service/intervention clearly described and delivered consistently across the study population?
Item 7. Were the outcome measures prespecified, clearly defined, valid, reliable, and assessed consistently across all study participants?
Item 8. Were the people assessing the outcomes blinded to the participants' exposures/interventions?
Item 9. Was the loss to follow-up after baseline 20% or less? Were those lost to follow-up accounted for in the analysis?
Item 10. Did the statistical methods examine changes in outcome measures from before to after the intervention? Were statistical tests done that provided p values for the pre-to-post changes?
Item 11. Were outcome measures of interest taken multiple times before the intervention and multiple times after the intervention (i.e., did they use an interrupted time-series design)?
Item 12. If the intervention was conducted at a group level (e.g., a whole hospital, a community, etc.) did the statistical analysis take into account the use of individual-level data to determine effects at the group level?
Legend: CCC, yes; CC, unclear/cannot determine; C, no; NR, not reported; NA, not applicable. The bold values are the median (1st and 3rd quartile).

Frontiers in Physiology, frontiersin.org

Results
In the systematic review, 19 studies were ultimately included. A summary of the database search and selection process is depicted in Figure 1. The methodological quality of 6 of the included studies was assessed as "good," while the remaining 13 were assessed as "fair." A detailed assessment of the studies can be found in Table 2. In Table 3, a summary of the methods for standardizing test protocols in the studies included in the review is presented.

Choice of calculation method
A study conducted by Sokulska et al. (2024) demonstrated that the method of calculating scores based on the maximum repetition, compared to the average of 3 repetitions, yields statistically significantly higher normalized scores in each direction and a higher composite score (the differences range between 1.8% and 2.8%). Conversely, the study by Shaffer et al. (2013) indicates that the method of calculating scores based on the average of 3 repetitions, compared to the method based on the maximum repetition, is characterized by more favorable reliability indicators, i.e., higher ICC as well as lower SEM and MDC. In contrast to these findings, the study by Kattilakoski et al. (2023), which compared methods based on the first 3 repetitions, the best 3 repetitions, and the maximum repetition, showed that the method of calculating results does not affect reliability indicators. Detailed data can be found in Table 4.
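The calculation methods compared above can be contrasted in a small Python sketch. It assumes that the "first 3" and "best 3" methods average the selected repetitions, which the text does not state explicitly. The function names and the trial values are made up for illustration.

```python
from statistics import mean
from typing import Sequence


def score_max(trials: Sequence[float]) -> float:
    """Score based on the single maximum repetition."""
    return max(trials)


def score_avg(trials: Sequence[float]) -> float:
    """Score based on the average of all recorded repetitions."""
    return mean(trials)


def score_first3(trials: Sequence[float]) -> float:
    """Assumed 'first 3' method: average of the first 3 repetitions."""
    return mean(trials[:3])


def score_best3(trials: Sequence[float]) -> float:
    """Assumed 'best 3' method: average of the 3 largest repetitions."""
    return mean(sorted(trials, reverse=True)[:3])


# Hypothetical normalized reaches (% limb length) from 5 test repetitions.
trials = [62.0, 64.5, 63.0, 65.5, 64.0]

for name, fn in [("max", score_max), ("first 3", score_first3),
                 ("best 3", score_best3), ("avg of all", score_avg)]:
    print(f"{name:>10}: {fn(trials):.2f}")
```

Since a maximum can never be smaller than a mean of the same values, the maximum-based score is at least as high as any of the averaged scores, which is the direction of the difference reported above. How large the gap is depends on the spread between repetitions.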

Arm movement restriction
Most studies indicate that the test procedure without arm movement restrictions, compared to the procedure with restrictions, yields statistically significantly higher scores in each direction and a higher composite score, regardless of the age and gender of the test subject, as well as the use of footwear during the test (Hébert-Losier, 2017; Objero et al., 2019; Muehlbauer et al., 2022a, 2022b; Sogut et al., 2022). Conversely, the study by Sogut et al. (2022) indicates that the test procedure with arm movement restrictions, compared to the procedure without restrictions, is characterized by better reliability indicators, i.e., higher ICC values and lower SEM and MDC values. Detailed data can be found in Table 5.

Wearing footwear during testing
The results of the study by Sogut et al. (2022) indicate that performing the test in footwear, compared to testing without footwear, yields statistically significantly higher scores in the PM direction and a higher composite score (regardless of whether the trial was performed with arm movement restrictions), as well as in the ANT direction (in the case of the procedure with arm movement restrictions). Conversely, the results of the study by Park et al. (2023) indicate that performing the test in footwear with regular insoles, compared to performing the test barefoot, yields statistically significantly higher scores in the PL direction for the dominant leg. Additionally, performing the test in footwear with textured insoles, compared to performing the test barefoot, yields statistically significantly higher scores in the PM and PL directions for both legs. Detailed data can be found in Table 6.

Warm-up
The results of the study by Bizzini et al. (2013) indicate that preceding the test with the "FIFA 11+" warm-up significantly increases the composite score compared to testing without a warm-up. Similarly, the study by Imai et al. (2014) shows that a warm-up including trunk stabilization exercises significantly increases the scores in the PM and PL directions, as well as the composite score, compared to testing without a warm-up. Conversely, the study conducted by Gogte et al. (2017) did not show statistically significant differences in the composite score for tests preceded by active, passive, and combined warm-ups. Additionally, the results of the study conducted by Belkhiria-Turki et al. (2014) indicate an unclear, trivial, or small effect size of including static or dynamic stretching exercises in the warm-up preceding the test, regardless of the number of repetitions. Detailed data can be found in Table 7.

Number of familiarization repetitions
The results of studies conducted by Linek et al. (2017) and Kattilakoski et al. (2023) indicate that achieving a plateau in reach distances requires 6 familiarization repetitions. A test preceded by 6 familiarization repetitions is characterized by the following reliability indicators: ICC = 0.57-0.82, SEM = 3.30-5.90, and MDC = 7.68-13.5. Conversely, the studies by Munro and Herrington (2010), as well as Robinson and Gribble (2008), suggest that a plateau can be reached after 4 familiarization repetitions. A test preceded by 4 familiarization repetitions is characterized by the following reliability indicators: ICC = 0.84-0.92, SEM = 2.21-2.94, and SDD = 6.13-8.15. Additionally, the study conducted by Onofrei et al. (2019) indicates that for athletes with experience in performing the test, 1 familiarization repetition is sufficient to achieve consistent results. In the situation where the test is preceded by 1 familiarization repetition, the reliability indicators are: ICC = 0.90-0.94, SEM = 0.91-2.86, and MDC = 2.54-7.94. Detailed data can be found in Table 8.
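To show how the reliability indicators quoted above relate to one another, the following Python sketch uses the standard textbook definitions: SEM as the standard deviation multiplied by the square root of (1 − ICC), and MDC (or SDD) as a z-value multiplied by the square root of 2 and by SEM. These formulas are not taken from the included studies, which may have used other variants or confidence levels. The function names and input values are illustrative assumptions.

```python
import math


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement: SD * sqrt(1 - ICC) (textbook definition)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC is expected to lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def mdc_from_sem(sem: float, z: float = 1.96) -> float:
    """Minimal detectable change: z * sqrt(2) * SEM (z = 1.96 for a 95% level)."""
    return z * math.sqrt(2.0) * sem


# Hypothetical between-subject SD of 5.0 (% limb length) and ICC of 0.84.
sem = sem_from_icc(sd=5.0, icc=0.84)
print(f"SEM = {sem:.2f}")
print(f"MDC95 = {mdc_from_sem(sem):.2f}")
print(f"MDC90 = {mdc_from_sem(sem, z=1.645):.2f}")
```

Under these definitions, an SEM of 2.21 gives a 95% value of about 6.13 and an SEM of 2.94 gives about 8.15, which happens to reproduce the SDD range reported above for 4 familiarization repetitions. Whether each included study used exactly this formula is not stated here.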

Using a dedicated test kit during testing
The study conducted by Bulow et al. (2019) indicates that performing the test using a dedicated kit, compared to testing without equipment (using tape on the floor), results in statistically significantly lower scores for all directions and the composite score.Conversely, the results of the study by Jagger et al. (2020) show that performing the test using a dedicated kit, compared to testing without equipment, results in statistically significantly higher scores exclusively for the PL direction, with no significant differences for the other directions.Detailed data can be found in Table 9.

Heel lifting restriction
A database search did not reveal any studies on the impact of heel elevation on test outcomes.

Discussion
The aim of this systematic review was to compile studies verifying the impact of protocol variables on the outcomes of the "SEBT group," including choice of calculation method, restrictions on arm movements, testing with footwear, warm-up procedures, the number of familiarization repetitions, the use of a dedicated test kit and restrictions on heel lifting. The study found that the choice of calculation method and arm movement restrictions have a significant impact on test results. Conversely, the influence of footwear, warm-up, and the use of a dedicated test kit remains unclear based on the available research. It also appears that the number of familiarization repetitions required to reach a plateau varies depending on the biological maturity level of the tested individual. A database search did not reveal any studies on the impact of heel elevation on test outcomes. As the first review to systematically compile the impact of these variables on the outcomes achieved, it offers a valuable source of information that can be useful from both a research and clinical practice perspective.
The results of the study by Sokulska et al. (2024) indicate that choosing a calculation method based on the maximum repetition, as opposed to a method based on the average of 3 repetitions, leads to higher test scores. Conversely, authors of studies analyzing the impact of the choice of calculation method on test reliability indicators have reached somewhat different conclusions. According to Shaffer et al. (2013), reliability indicators are more favorable for the method based on the average of 3 repetitions compared to the method based on the maximum repetition. In contrast, according to Kattilakoski et al. (2023), the calculation method does not affect reliability indicators. This study analyzed methods based on the first 3 repetitions, the best 3 repetitions, and the maximum repetition. The discrepancies in the results may stem from differences in the test protocols. Shaffer et al. (2013) did not precede the test with a warm-up and performed 3 test repetitions following 6 familiarization repetitions. In contrast, Kattilakoski et al. (2023) preceded the test with a warm-up consisting of 5 min of walking followed by 5 min of jogging at a self-selected pace, and conducted 5 test repetitions preceded by 1 familiarization repetition. Based on the above data, it can be speculated that preceding the test with a warm-up reduces the dispersion of individual repetition results, which in turn makes the choice of calculation method less significant. The reduction in the variability of individual repetition results may be associated with the optimization of the postural control system due to warm-up, as observed by Paillard et al. (2018).
Research findings indicate that a test procedure allowing arm movements (compared to one with restrictions) enables achieving statistically significantly higher scores, at the cost of slightly reduced reliability (Hébert-Losier, 2017; Objero et al., 2019; Muehlbauer et al., 2022a, 2022b; Sogut et al., 2022). Higher scores obtained during testing with unrestricted arm movements can be attributed to at least 2 reasons. First, the arms act as a counterbalance, making it easier to maintain the vertical projection of the center of gravity within the base of support (Roos et al., 2008). Second, moving the mass away from the axis of rotation (outstretching the arms) increases the moment of inertia, which reduces angular accelerations, giving more time to perform corrective movements (Hill et al., 2019).
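The second mechanism can be written compactly using a simplified rigid-body picture. This idealization is offered only as an illustration and is not taken from the cited studies. The symbols are assumptions: \tau is a disturbing torque about the axis of rotation, I the moment of inertia, \alpha the resulting angular acceleration, and m_i and r_i the mass and distance from the axis of each body segment.

```latex
\tau = I\,\alpha
\quad\Longrightarrow\quad
\alpha = \frac{\tau}{I},
\qquad
I = \sum_i m_i r_i^{2}
```

For the same disturbing torque, a larger I gives a smaller \alpha. Because each segment contributes in proportion to r_i^2, doubling the distance of the arm mass from the axis quadruples its contribution to I.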
The study results do not allow for a definitive determination of the impact of wearing footwear during the test on its outcomes (Sogut et al., 2022;Park et al., 2023).Additionally, interpretation is hindered by the lack of mention in the cited studies regarding the standardization of the test protocol concerning heel elevation restrictions.On one hand, it can be speculated that wearing footwear during the test compensates for limitations in ankle dorsiflexion range of motion (by elevating the heel), which may be particularly important in testing procedures that prohibit heel elevation (Basnett et al., 2013;Olszewski et al., 2024).However, it is important to remember that differences in footwear design can be a confounding factor in the results.On the other hand, it can be assumed that performing the test barefoot allows for the precise acquisition of sensory information through the receptors located in the foot (Viseux, 2020).Additionally, it is important to note that moving in footwear is currently more natural for people than moving barefoot, making their postural control system operate in conditions closer to those encountered in daily life when tested with footwear.
The study results indicate an ambiguous impact of warm-up on "SEBT group" outcomes. The work of Bizzini et al. (2013) showed that after performing the FIFA 11+ warm-up, the composite score increased significantly. Imai et al. (2014) observed that a warm-up consisting of trunk stabilization exercises increased normalized reaches in the PM and PL directions as well as the composite score. This study did not observe changes in the effect of a warm-up consisting of conventional trunk exercises. Gogte et al. (2017) did not observe differences in the effects of passive, active, and mixed warm-ups on the composite score. Belkhiria-Turki et al. (2014), examining the impact of a warm-up consisting of a 5-min run combined with static or dynamic stretching of varying volumes, observed mainly unclear, trivial, or small effects. The discrepancy in research results is likely due to the diversity of warm-up protocols used by the authors. It can be assumed that each protocol prepared the body differently for the test task, which consequently led to differences in the results (van den Tillaar et al., 2019; McGowan et al., 2015). This observation indicates the need to develop a standardized warm-up protocol tailored to the needs of the test.
Most studies indicate that to stabilize the results in the "SEBT group," it is necessary to precede the test with 4-6 familiarization repetitions in each direction for each leg. Based on the findings of Munro and Herrington (2010), as well as Robinson and Gribble (2008), it can be assumed that for testing adults, the number of familiarization repetitions should not be fewer than 4, while for adolescents, it should not be fewer than 6. In contrast, a study conducted by Onofrei et al. (2019) indicates that stable results can be achieved after just 1 familiarization repetition in the case of adult elite athletes who have experience performing the "SEBT group." Based on the above observations, it can be assumed that an important criterion for selecting the number of familiarization repetitions is the level of biological development of the test subject. As indicated by Kiers et al. (2022), with the advancement of biological maturity, the efficiency of the postural control system increases, which, as the authors of this review suggest, may affect the effectiveness of adapting to the demands of the balance control test.
The comparison of the results obtained by Bulow et al. (2019) and Jagger et al. (2020) reveals ambiguity regarding the impact of using a dedicated test kit on the obtained results. Despite this ambiguity, the practical benefits favor testing with a dedicated test kit.
[…] additional equipment, and specifically prepare the body for the test task.

Number of familiarization repetitions
Conducting 4-6 familiarization repetitions before the test is recommended to enable participants to adapt adequately to the test requirements.Typically, 4 repetitions are sufficient for adults, while up to 6 repetitions may be necessary for adolescents due to their ongoing biological development.For elite athletes experienced in performing the test, fewer familiarization repetitions may be appropriate.

Using a dedicated test kit during testing
The use of a specific test kit is recommended to standardize the testing process.This approach simplifies the procedure, promotes consistency, and helps in comparing results more effectively across various studies.While there are mixed results regarding its impact, the practical benefits of using a dedicated test kit, such as ease of use and standardization, are undeniable.

Limitations
Because the analysis focused solely on the healthy population, this systematic review has certain limitations, considering the application of these tests in specific clinical entities. The diversity of purposes and clinical contexts in which these tests are used may justify deviations from the proposed protocols. Our review aimed to explore how specific protocol variations affect test outcomes, but it was not intended to establish rigid guidelines for conducting the test across every population. Therefore, while we strive to provide general guidelines on protocols, it is important to remember the need for their adaptation to the specific clinical needs and characteristics of the populations being studied.

Conclusion
In conclusion, this review highlights the significant role of the choice of calculation method and arm movement restrictions on the outcomes of the "SEBT group." It also notes the ambiguous impact of wearing footwear during testing, warm-up, and the use of a dedicated test kit on the results. Additionally, it appears that the number of familiarization repetitions required to reach a plateau varies depending on the biological development level of the tested individual. Future research should focus on developing a standardized warm-up protocol tailored to the needs of the "SEBT group," and verifying the impact of heel lifting during testing on the obtained results.

TABLE 1
Search strategy.
Search 4 ("Y-Balance Test" OR YBT OR "Star Excursion Balance Test" OR SEBT OR "Star Excursion Test") AND (shoe OR footwear OR barefoot OR sole)
Search 5 ("Y-Balance Test" OR YBT OR "Star Excursion Balance Test" OR SEBT OR "Star Excursion Test") AND (hand OR arm OR upper limb OR heel)
Search 6 ("Y-Balance Test" OR YBT OR "Star Excursion Balance Test" OR SEBT OR "Star Excursion Test") AND (procedure OR guideline OR manual OR standard)
Search 7 ("Y-Balance Test" OR YBT OR "Star Excursion Balance Test" OR SEBT OR "Star Excursion Test") AND ((kit OR set OR suit) AND test)
Wiley ("Y-Balance Test" OR YBT OR "Star Excursion Balance Test" OR "Star Excursion Test" OR SEBT) AND ("warm-up" OR ((attempt* OR trial* OR repetition*) AND number*) OR ((maximum OR average OR mean) AND reach)

TABLE 2
Methodological quality of included studies.

TABLE 3
Method of standardizing test protocols in included studies.
Item 1. Was a warm-up conducted before the test?
Item 2. Were familiarization trials performed before the test? If so, what was their number?
Item 3. Were arm movements restricted?
Item 4. Was the test conducted barefoot?
Item 5. Was the heel lift allowed?
Item 6. Was the order of trials in each direction specified?
Item 7. Were errors that resulted in a trial being disqualified specified?
Legend: CCC, yes; CC, unclear; C, no; P, purpose of the study to examine this variable under different conditions.

TABLE 4
Summary of study results on the impact of the choice of calculation method on test outcomes.
%LLL, percentage of lower limb length; ANT/PM/PL and CS, anterior/posteromedial/posterolateral reach and composite score; LL/RL, left/right leg; KL/SL, kicking/stance leg; max/avg, calculation method based on maximum repetition/average of 3 repetitions; F3/B3, calculation method based on first 3 repetitions/best 3 repetitions; %DIFF, percentage difference between max and avg; ICC (95%CI), intraclass correlation coefficient with 95% confidence interval; SEM, standard error of measurement percentage; MDC, minimal detectable change; p-value, probability of type I error; SD, standard deviation.

TABLE 5
Summary of study results on the impact of arm movement restrictions on test outcomes.


TABLE 6
Summary of study results on the impact of wearing footwear during testing on test outcomes.
%LLL, percentage of lower limb length; ANT/PM/PL and CS, anterior/posteromedial/posterolateral reach and composite score; WS/B, with shoes/barefoot; SRI/STI, shoes with regular/textured insoles; MS, minimalist shoes; ICC, intraclass correlation coefficient; SEM, standard error of measurement; MDC, minimal detectable change; p-value, probability of type I error; SD, standard deviation.

TABLE 7
Summary of study results on the impact of warm-up on test outcomes.

TABLE 8
Summary of study results on the impact of number of familiarization repetitions on test outcomes.
[…] PM/PL and CS, anterior/anteromedial/anterolateral/medial/lateral/posterior/posteromedial/posterolateral reach and composite score; RL/LL, right/left leg; ICC, intraclass correlation coefficient; SEM, standard error of measurement; MDC, minimal detectable change; p-value, probability of type I error; SD, standard deviation.

TABLE 9
Summary of study results on the impact of using a dedicated kit on test outcomes.
%LLL, percentage of lower limb length; ANT/PM/PL and CS, anterior/posteromedial/posterolateral reach and composite score; RL/LL, right/left leg; p-value, probability of type I error; SD, standard deviation.